data using a



standard notation i 

g?) Documents IfP^^^So "^eTteSturS 
® (SlOO)are tra"!^?'^^! '?Scs (S160) by 

^^nd^ff^al^S^t.^. whici, use a 
qraph cs and texaiai results of 

^^dard notation including 
tt,e document «P°9";°^ ^ciipBon lan- 
any ambiguihes, a^°^-7of the document, 
gige. Recognized port^s ot^n 
^p^nted ^ edtoWe ^ed d J^.^^^^ 
example ASCII, are piao^^ ,anguage. wrth aD 

-HI the <lo'^'"®"l,f®p„t sharing common 
contents of an ^ernentshanng ^ ^^^^ ^ 

characteristic Bemen^^^^^^^^^ ^5140). 

example: °^^2«riements (S150). ques- 
questionable-ctrararter-elements t^^^^^^^j,^ 

'^°'^^'''*^S.S^ord-elements. ^segment- 
ments. pnts Each e ement m- 

elements. and aw-elements. ta 

dudes editable ''^g'^^a identifying any 
"•^^'SalS S Snsfom^ed with a 
^It^S^SLl of confidence. 
recognizing textual 9; j^.^ap images, and
„ents originally ^P^.^^'^f ' ^%'^nSion process. 

for reccing the resdte ^^^^^.'^'^tomatic transfor- 
Document recognition to ne ^^^^^.^ 

nation of paper '^^^'^^''J^fZsfcrrna^ of 

documents. It «"'«''V**'l^!!!fnte through succes- 
bitmaps into structured componente^h^^ 

sweandrecursiveinterv^^^^^^^^ 

These processes J^'J^^i. logical struc- 

acterrecognWon grap^^^^^^^ 3e„«nlic 

ture reconstruction. ^^^^^^,ef„one\.oms\n- 

analysis, etc. All these ^ ,ecor6 of the 

lerpretation. Not P--^f^^^,,%„d the ones 
^misinterpretations theyje a^^^^ ^,^.„g 

thatdokeepaiBcordha>jn ^ 

so. AS a consequence, downsff w ^^j^.^ 

generally not V^^^^^^^J^Z^ processes, and 
Juities handed to t^e-J by "P^*^^ is lost in- 

limply discard '^^^^^^^Z^-^mr^^^''' 
steadof belngexj-oltedft^^^ ^ , 

the document recogniuon lu ^^gje 
hand, the ^^^^1:^^^^ corrections 
to the user, the chore o^jf^^^^"^ of automatic 

^r^nl^^ra^nrr^-^^^^ 

dose an apparatus «""^^,"''^„„ot be ma^^^^ 
correcting characterswhjh<^nn«^^^ 

A bitmap video '•"«9«°!!J,Xtine of neighboring 
ter(s) is inserted in an ASCII data the 
'characters, thereby anowing^^^^^^ p^p,, 

characters) in ^Tr^ctert") Subsequently, with 
identrication dthe characMs). S^^^^^ ^^^^^ 

the aid of the video '^aja. ^ ^ ^ther means, 
correct character(s) ^^^^^^^ operator interao 
This apparatus and '"^^^"''^^''^^^^^^ from an aut^ 
tion to clarify any «"b.9u;^J^ ^^,3 ^sults of 

expressed by the 9^-"-^^^ ^J^^mar desaibes 
of an unknown input image, i n a 
,he image as subs^u.^^^^^^^^^^ 
betweenthem. n the J^^^^^^ are identified, a 
structures and their ^e'auv gubstructures and 

search is -^^^^^^'^Snn th: unknown Input inv 
their relative relation ex,snn^ ^^^^ 

age. and rfthey do. th^'J^'^ft^e analysis. If the sub- 
further resolved to ^^^J"* /^^J^j Jes are search- 
structures to not exist, f^^'^^n Input Image is 

ed and the structure °V'l«"*;'S"he search. For 
thus represented from the result ot I 



example, the locaUoncfar^^^^^^^^ 
documentwhichconteBisa stetem .^^ 

document g-ammar H^^^^ues. See Fig- 
THOR-) te initially J, seating this regton 

, ure10ofUS.AA907^^^^^^^ 
in the document, the approp 

substituted for the >«n^^^^^^^ 

US-A-4.94g^l88toSatod ^ ,,3^cter or 

essing apparatus ^ ^ymn jescripton 
,0 graphic ^-^:;:'^:^:'^l^eL^oprocess. 

language and ^^^^^^^J^^ descriptton language 
ingapparatusgenerat^apag ^^^^ 

induding code ^ata wJh* rep ^^^^^ 

graphics ^-^^^:^X^son^r^-^rns,e.^ 
IS wh'chcausesapnnertopni 

biguities from previous ^ ^^cription lan- 

esses are not '^^j^^ in column 4. lines 

guage. See. for a«'"P'^' "^^^g^, device receiving 
ilS. Accordingly, any dovmstrearno^ .^^ 

'^''^'^'I^l^TB^^^on processes, 
performed doc"»^""^ ^^ ^ discloses a meth- 
US-A-4.654.875 to Snh^";' optical char- 
odd automatic language «<»gn^n_^^^^ 

25 acter readers. L«"9"^9^jVn thTbasis of. channel 
orstructuresjsan^g^do^^^^^^^^^^^ 

characteristicslnthefwm ^ ig^er. the prob- 
intheinputls acorruptonof ano^ ^^^^^^^ 

abilities ofthe letter ocoirnngs^^ 
30 ognfeedlettersthatP^ced^be^. 3rta«^ 
or particular stnngs of lettere o ^ords 

lexical ^sS^Si^e. Ambiguities from 

represented as a graph ^^"^^.g pot recorded, 
u^tream recognition proce^^J^J^^^ ,^^^^„. 
35 "^^ord Association Nwrns W ^^^^p^,. 
and Lexicography-, by Kenneth w ^ 

Tck Hanks. Computa«onall^2^^S?^^ 
(March 1990) ^^^^I'^l^^^^^^rrreOoni^eo^^ 
^association ^ti^^J^^^'a^"^ e^^ word as- 
,0 notion of mutual ^^r readable corpora. 

« as possible words. ^ ^ characters of Any 
•on the Recognition ^^^"^-^^ pavlidis and 
Font and See". by Sin«n K^^^^ 

Henry S.Ba.rd. IEEE T^ran^ g_ 2 

sis and Machine ""telfQenc^. vo ^ ^los 
« (March 1987). discloses a aystem^-; ^^^^ 
Ue««textofvanousfonteands«e^^^^^ 

alphabet Thinning "J^ . run-length encoding 
formed directly on a 9«Pb ^^J^, J^^^^^^^ and other 
of the binary ''"^9^,^^,^o g Sape-clustering ap- 
55 shapes are "^^PPli"^^^,^^ a^ then fed into a 
proach. into b'n^ry f eabjres whw^ 
statistical Bayesian classrf.er.Tnis^y 
nniltiple possible characters or word 
of the Pr^««"»''~^";!ierof systemsextetv^ichcan
in summary, a nun*er oTsy ^^naracters. 

recognize ^^^^J;;lT^^c^^ 
words, semantics, fonte) and . 9 ^ 

mine the ""^ertamty wunw ^^^^^ 

temsrecordtheresuU o^^^^^^^^ 

eluding uncertamt.e8).najarm in 
t,y other J„^f ^^" teinty) being lost, espe- 

images l*«f*r^^^'^i,na«on ,«««»- 

.H- foreaoing and other objects, and 
To achieve the f«rego.ng ^^^^ ^^^^^ 

to overcome the ^^^T^^^^J^vided for com^erting 
„,ethods and apparatus are p^ ^ 
documents ^«P'«^""*^j'l a standard notation in a 
editaWe coded '•.-^•^^^^^^ag^^^^^^^^^^ for record- 
document desa.pt.ona^^^^^^^ ^^^j, docu- 
j„g document «co9n.tion aino g ^^^^^t rec 
i recognizer. When t^^-^^^^^^^^^^ standard 
ognition processes are re(»ro ^ ^^.^^^^ 
notation, any a"*'9";f^^;^« ger level document 

processes. .;„n the standard notation of 

,nparticular.whenus gth^s^^ 
the present invention, each ooc 
iUdtheresul^of.^^^^^^^^^ 
n^ore elements. se^ecUveiy ^^^^^ .^^^^ 3 

n,ent description language Ea ^^^^ 
type-identr.erind.cat.ng a type ed) bit- 

Tation) regarding the ^^^^J^ ^.ement also in- 
n.ap image of the type identi- 

dudes edteble coded data tne 
f,edbythetype-.denW er and ^ 

certainty '"^''^'^^T '"^^Xa predeterml^^ lev- 



termined by the <^:'^^Sym^\e.el 6c.r. 
edlnaformatthatlsreadab^ebyh^ 

stream documertrecogn^^^^ 
^tloncanindudethel^el 

, the uncertain coded date « 

document recognce^ ,„ solving amb»- 

hlgher level -""^'"^"^^^^n^ation can also Indude 
guities. The ""^^J^'^^j; each uncertain recogn.tK,n. 
Alternative coded dateft^eacn^ ^ ^ 

When the document recog g„ized 

recognizer, any -^^^Cl^ ^nf ".^^^^ are Identi- 
with a Predetermh,ed leve^ <J ,^ g„estlonable- 

fied and '^'^'^''^^^^^Z^oicertBinXy as wel 
character-elementsThedeg ^^^^^^.^^^g^e f 

,5 asalternativeposs.blecharacte g^onable 
" certainty can a'«>^ '^'J^^twi^ recognized v.ith at 
rr-^Strmirlvelof.^^^^^^^^ 

SrJldtcUer-s^nS..^^^^^^ 
When the document re«jn^^ 

recognizerlsuchaMwexam^^ 

the v^ord recognize ^Pj hethera^ 
questionable character ^y^ ^„^t.on- 
lords exist m a ^.^"^iTn characters in the wo,^ 
25 able oharacterandthe certain „ a word b 

containing each word containing a 

identified in the '^^^/^ord is identified as a 
questionable character. . ^erif ied-word-ele- 
^erif led word, and is ^^J^^^^^ are found, they 
30 ment if more than one ve jed «^^^^ 

areplacedinindw.dualverf.ed ^.ternafive- 
a« cdlectively grouP^^ X; 
word-element W ^^^^^^ character, the ques- 
word containing a 
35 tionabi^character-element e^^^^ ^ 

When the '^°'^'^^'^JS^,efr^^iye verified 
n^antics ^n^Vzer.^r^JJ^,"^^^^^^ 
words are resolved by an^V^ng ^ 3,^,. 

,ng the alternative venf led woKls^ ^ p^^^^^^. 
^ nLverifledwordscanbeco-^;'; 

mined level of "^^Jled with the surround- 
analysis.itis returned andmj^^ 
ing character-stnng-elemente. . 3^,0 ven- 

iyjer cannot deterrnine wmch o nh ^^^^^^^^.^ord- 
« ftedwordsiscorrec^rtrewnstn^ 

Ument (and j-'"^^,,;^^^^^ probabi^ 

such, and can mdude date ma ^.reotword. 
itythat each verified worft^^^^^^ 

When the documentrecogn-z ^^^^ 

ics structure '^"^-rr^^;^^^ Jresentetlve of 
elements containing coded^ 
graphics structures re«^9"^^^,,.. ,,„es defined be- 
age. These e'^"'^'"^*,''^" "tc Additionally, line 
Seen endpoints; ^^ote returned and re- 

55 thickness "J^^^^f °? r^ignWon process sud. 
corded. Amb.gu.hes.nthfreg^^ ^^.^^^^ 

rrrbrr^-datacanbeusedby 
Kiohfir level graphics recognition proc- 
downstream higher •e^®' 9?r,Mes or to recognize 

esses to resolve any ^'"^'^^V^examP'e.*^"^ 
^ore complex graph.^s^"c^^^^^ 

lines recognized « '"^'"f a higher level 

could be deterrn-ned to 4^„dpointscan 
g,aphlcsreoogn=«rrf fcjexamp^ ^^^^^^ ^ ^ 

be determined with a high oegrB 

coincident ^^nnition elements are pro- 

Additional "^^l^^^^Xti^gMevpor. 

erence to the 'o^;"^ .J^Se^and wherein: 
nS:rr.%tr;ra9t..age.sedto™^^^^ 

the present ^-^f *'°":^^_-ter-strin9-eiementfor 
Figure 2 illustrates ^/^.^^''^Sgnized with 

ZoZ w-rth a low con»'X5-t>n..e.ement 

Sa^r;^.rss"^^?^^-""* 

whichwasnotfoundinalex«o^^^^ ^ 

'T.' iSwo^sl^fn^inalexU^^ 
coHechng verified woro^ question- 

lating to a line se9m®"*^. .^^^ graphics- ele- 



Figure 17 illus^t^ a ^cument; 

l-igures ... g document; 

essary for descriDina a ^^stem for nput- 

wo«l8 processed by characteMBi^gnlzers. 

'''"'^Teometryofiinesegmentsandarcsprocessed 

by graphics re«)gnizers. ^^^^ 

Each of t*'^^ P'JSX^ (hereinafter refer- 
sumeabyte-onentedd^str^ v^^^^^ ^^^^^ 

^^J^TswJms(he.Binafterreferred 
DRstream). and bitmap sO«anw v 

I as image f Oes). ^^X^ l^^^^ or several 
« DRstream «rn^ lnforn«bo^^^^^ 

pages of a d'9*«V «Snaraohics primitives, and 

Lbestext^hfon -^-92;^:,,^^^^^^^^ ^nd 
half tone images as well 

the a-^biguitles atout^toTdoe^ any new 

The present »wenhon ao ^trecog- 
documentrecogniuon proces«a8^or ^^^^^ 

nfeers) in the characters 
^'^''f''':^^Z^^T6^e.mr.e words (by oom- 
or graphics s*'"'*'^' °' ^ against a lexicon of 
30 paring sequence 0^^^^^^^ 

known words), ordetermm ^gent in- 

ofpossiblewo,dsiscorre^H<^^^^^^^^ 

vention '"'P'^^^^ iL^tv^s of recognizers funo- 
with which these d^erenuy^^ ^^^^^^^^^^^^^^^ 

- rror'bv"-^^^^^ 

- '""C^s illustrate this document recogni- 
^ds coded data, correspom^^^^^^ «
recognition pro.^8wh^rtp^f«m^ 

nation, referred toj S^^^^^^^^ 
mentcontainscodeddatewn . 

as being sim»ar m characters, eta), 

graphics, same page^ ^V^^^identtf ier which in- « 

Each element '"•^'f "^.^.^ con^in^ m that ele- 
dlcates the type of coded clatecon ^^.^^^ 

ment; b) an '^!"^^ents of a document, 

amongst all simrtar JPe ^^^^-^ ^„ „ther similar 

which <«fe«"9"'^*'f *rf„irert 
type elements so that an ^^^^J^^ ^^^e an Iden- 
5^there,emen.(m^^^^^^^^ by the 

tif ication numbei). c) w ^^ngs 

document recogntton PJ^^j J, apWcs struct 
of characters or pararnetB^ d^mm^^^ ^„ 

tares): and d) ^^^'^'''^'^ZlJSw^r^ (for exam- 

butes) for P««'W "9 ^''tr 'IbSe coded data in- 
pie. uncertainty infbrn«hon)abo^^ ^^^^ ^ 

eluded in that informa- 
element can be "f«,^^ '^^S^entOnformationsuch 2S 

iionabo^'^f'^^^^^^Zce with which the 
as. for example. p^ible offsets for 

ronT.;r=^^^^ 

was determined with a >evei « " mustrated 
;edetermined level of^^'de^-^^^^^ 
examples.thecodeddate«re<»^^^^ 35 

able ASCII. how^^fJ^toLS^ia^^^^^^ the gen- 
One familiar with SGML^m ^ t,e,o„. 

eric contents of the ^^r^^^^^^^c^emeuiv^^^ 
Thus.onlyabriefdiscussionof agene ^^^^ 

be provided -th^rrbe spe^^^^^^^^^ described « 
each type of eletnent ^^J^J^J 3^.0 aiustrate a 
with reference to Figs. ^-^J/^^jVj -^n be used to de- 
complete syntaxof elementewh^h^^^ 

scribeadocumentaccordngtothe^^^ 

This Ifet of e'^-^" V-^'i be ufed by conventional « 
each DRstream. and would b^use J ^.^^ 
parsers, programmed to^^^^^^^^ 
SGMU to parse the DRs?^^^ ..^ents. a continuous 
Thatls. afterthe ^V^'J^'^^J^^^^^^^^^ document 
stream of eiemente ^^^^^^^^ tbe terminolo- 50 
would be provided. '^^I^^^.n^' refers to a group 
gy .continuous strea^ of e^^^^^^^ 

of elements which are.dentrf«o^^ ^g^^ 
er.Thus.lnamarkupanguagesuc^^ 

white-space is ^^'"'''^^^"t^^iraie lines consti- 55 
readability).tabs.bj.*a^^^^^^ ^^,3 
tate whrte-space that the pars ^^^^ ^^^^ 
sense, white-space « P^^ a ,.,„it on the 

of elements. Other systems may na 



DRstream. w^ere severaH^^^^^^^^ ,y 
longing together, is also '"tenaea ^^^nte'. 
tbe terminology "•^"^""fp^^'TsT-C include atlri- 
(Some of the elemente ^^^ J would be 

Us (to be descnbed be^^v^w^^ ^^^^ ^, 

listed at the start "f/l^^ f ^^f^ie not required to 

the elements listed in F'9^^^,^«^nition process; 
recordtheresultsofadocumenm^^^^^ 

however. whenmoreelements^P ^ .„ 

formation can be [^"IJ^JeSSts- means "define 
SGMLtheterminol(^y iELEMt" ^^^^^ 

an element whose ype s s . .^^ 

n,eans "the elemen b;9'"« ^^^^ment ends with </> 

f ier appears bracketed < J- * ^^^^^^er element be- 
(element-end market), or when an ^^^.^ ^^^^ 

Uatthesameorahjgher^^^^^^^^^ contents of this 

tare"; and "(tfPCDATA) J^^f ^ ^ pg. 2 def ines an 

elementlsachara.^rs,nn9 Jl^u^^^ (such as 

element «>"*«'"'"9^^S?ed as follows: 
-horse") which would be recoraea 
<s>horse</s>; or
<s>horse</>; or

<s>horse
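The three equivalent recordings above (full end tag, short end tag `</>`, or an end tag omitted because another element or the end of the stream follows) can be read back with a very small tokenizer. The following is a minimal sketch only, assuming flat, non-nested elements without attributes; the function name `parse_elements` is hypothetical and is not part of the disclosed notation.

```python
import re

# A tag (start or end, possibly the short end tag "</>") or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z]*)[^<>]*>|([^<>]+)")

def parse_elements(stream):
    """Return (type, text) pairs for flat elements in a DRstream-like string."""
    elements = []
    current_type = None
    text = []
    for match in TOKEN.finditer(stream):
        closing, name, chars = match.group(1), match.group(2), match.group(3)
        if chars is not None:
            # Text outside any element (white-space between elements) is ignored.
            if current_type is not None:
                text.append(chars)
        elif closing:
            # "</s>" and "</>" both end the element that is currently open.
            if current_type is not None:
                elements.append((current_type, "".join(text)))
            current_type, text = None, []
        else:
            # A new start tag implicitly ends an element left open.
            if current_type is not None:
                elements.append((current_type, "".join(text)))
            current_type, text = name, []
    if current_type is not None:
        elements.append((current_type, "".join(text)))
    return elements
```

All three forms of the "horse" example yield the same single `("s", "horse")` pair under this sketch.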

Other possible contents of an element can be other
elements (see, for example, the element of Fig. 5
which includes two or more other elements as its con-
tents), or only attributes (indicated by EMPTY and
an attribute list, ATTLIST). A repetition indicator in-
dicates that the preceding item can be
repeated. These definitions will become more clear
as each element is defined in more detail below.

Rgu« 1 -f^-^P'^fS^I^p images which 
used to illustrate the type^ "o ^^^m of 

can be transformed ajK.^^^^^^ 
recordation, using the p^sent^^^^ 

image «nd"do« ^«"°"^^*re be«use their poor 
characters hard to «~9"^ ^^S^phicsintheform 
shapeorpoorciuamj^u^re^S^^^ 
of two line segments, bitmap grap ^^^^^^^ 
some undef ined drav«n9.1og.^l^^u 

of footnote and "^^1°; cteS-strlng-element (s) 

high confidence level (having at least 
-•'"^^rohrctrS;^^esamefon^ 

position and ""''^'""'"S/J^^Ji^^^^^ 
therelsnoslgnificantwhitegap 

Character (for instance. ci^^^^Smntf text, sepa- 
tally aligned .''f to two 
rated by a certain a^ount^w" type- 
Have id numbers, but instead en bep,aced imager

^'^■^^treferenc^totbeFigu.^^^^^^^^ 
that image having a f-^^ t^SideSce by s 
at .east -^^^ ^corded using the 

a character ««»9"«'^' sGI^L as follows: 
present invention .mplen^nted n^ 

<s>Etymologies appear m ^ 

^^^O^l^STWns-. in accordance with to 

, chnws a questionable-character-ele- 
Rgure 3 s»'°"!,^^?^„ecognizer places char- 
ment (qc) N«rt>ere a chara^r ^^^^^ 

actersthathave alow <^^^^^l^^J, currently de- « 
ognized. Existing ^ara^^^^^^ „ a 

termine a level of <»"<^«"^ ~ at least a predeter- 
charaoter is not -^^^^^f ..tse^aracter recogniz- 

n,ined level of ^"^l^^^^^'Zse.er, bringing an 
erssomehowtaghecharacter^o ^^^^^^.^^^ ^„ 

uncertain character totheattem ^^ 
othermatter.Somevendo^h^^^^^^ 
age where«cogn-ng anda^^^^^^ these sy. 
are intertwined, it ^ because it is 

terns tag uncertain *aracte« ^ 25 
an internal matter. ^'^^^^^^'^Jo^n^^tB^ 
away by user ^^'^''^^-^^^^ a pair of question 
the uncertain characte^^say vnt P ^^^^ ^^^^^ 

marks, creating t»^« P""fJJ„g",sh these question 
down the line «nnot distinQU^ questionable 30 
„,arks from genuine ones^ "^^^^^^^Jthat can be 
characters are n°t«~«'fi;'^.^ "Juestion marks and 
usedbyothermacmnes.ajat« 9^ ^^^.^^.^ Thus 

highlighting may ^'^^^^J""'. higher level device such 
when this data is Pa^^^ ^ '^^^^^^^ checker will not be 35 
as a spelling checker, the jpel Ung c ^^^^ 

able to utilize t»^« ;"*°'"^J^;e^el of certainty, 
not recognized with a high ''agre , 

,„ the present invenuon. a Ijg^e^v^^ ^ 

ceives the informahon that a chara ^ 
cgnizedwitha hlg deo^^^^^^^^^^ 
character located in ^ ^"^ ... Thus.byusingano- 
^entcontainsthatcharacten^K-Thu^^ ^ 

tation in a document ^f^^^Xe uncertainty 
ambiguities.otherrea)gn«erscan^^^^ ^ 

information. P-eferaWy. ea^ qc ^e ^^^^ 
questionable character ^^J^ character 
'contain a list of allernaUve Ja«°^j;;33ib,e charac 
recognizer identif ies more than one p ^ 

u beiow the ^^'^rz:^.:^^^^^^^^ 

particular portion of a ^^''l""^Zxa\H)lof<iyi^^^^ 
Weeofcertaintyfo^heone-^^^^^^^^ 

able characters f^^^J^^^er-^ewB^ts are 
ment Ideally, ''"^^honable char ^^^^ 
subsequenliy elin^mated by a spem 9 ^^^^^ « 

For example, the system descrm ^ 
•.ncorporated paper ^-^^^^^ters (or words). 

used to generate a ^er^ahT Indtoative of the 
each having some type of measu 



and/or words would be reco^edjn PP 

tinct elements "f^j'^^J^^ention. This would 
guage according the PJ^^^-J^^^^^t recognizing 
enable other, h^*'^^ '"^g' arate from and used at 
processes (which may Ja separat ^ 

a time separate from '^^^^^^^^^.Ty^e present 
cess this informatonj^^^^^^^^^ 
invention also permiteexis«ng a ^ ^^^^ 

in a more efficient '^"'^Jd uncertain character 
guishlng between certainand un ^^^^^.^^ 

;i;S?npridrs^-belimi.dtotheunce. 

tain character t°;;^^3^-uesUonable-word-element 
Flgure4^ust.atesaque^ 

(qw) into which a w«™ ^, ,3tters recognized 
recker)placeswordstha^«rtoinl^ 

with a high level of ^^"^''^^^^"•^cognizer. There is 
found in the lexicon °f *^^',,^^nt These ques- 
one questionable ^^^^^J^^i^er word recog- 
tionaWe words '^'^^'^^^^^J^com, or by some 

""^"^ "'i^sisS asT^::^'^ analyzer), to be 
other means (sucn c» « 

described below. suppose all the charac- 

recorded in a qw-element as follows:

<qw>Jumblatt</>.
Figure 5 illustrates a verified-word-element (vw)

and an alterna-V^ST^a'-^^^^^^^^ ^ ,cund in its 

word recognizer p ac^ „„^nable-character-ele- 

auempt to ^ly'l^^J^t^teforwordslnalex- 

n^ents. The word ^^^^^uztiie character 
icon for each occurrencecrfaqu^ ^^^^^^y^ 

hasedupontheworda^oci^e^^^^^^^^ 
character-element tf awo«^^^^^^^ ^ ^ element 

word recognizer P'^^^^^^^ fc eliminate question- 
When the word recognEerti^J ^ ^^^,3^ ,„ 

able characters, it may '^^^^^f^^^^ decide b^ 
its lexicon. If the word reoogna^r ^ ^^^^ ^ 
tween the verified wordM Peaces e^ 
rw^rJi^thre^r.^a^down^^^^^ 

r;::trorrrrrc^^^^ 

alternative words. ^^^g conven- 
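The verification step described above (substitute each alternative character for the questionable character, look the resulting word up in a lexicon, and record one vw-element per verified word, grouped in an aw-element when there is more than one) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the function names, the in-memory set used as a lexicon, and the splitting of the word into a prefix and suffix around a single questionable character are all hypothetical.

```python
def verify_word(prefix, alternatives, suffix, lexicon):
    """Substitute each alternative character and keep words found in the lexicon."""
    candidates = [prefix + ch + suffix for ch in alternatives]
    return [word for word in candidates if word.lower() in lexicon]

def encode_verified(verified):
    """Record verified words using the short end tag form; None if nothing verified."""
    if not verified:
        # No word was verified: the questionable-character-element is left as it is.
        return None
    if len(verified) == 1:
        return f"<vw>{verified[0]}</>"
    # More than one verified word: group the vw-elements in one aw-element.
    inner = "".join(f"<vw>{word}</>" for word in verified)
    return f"<aw>{inner}</>"
```

For instance, with a lexicon containing both "wards" and "words", a questionable character with alternatives "a" and "o" between "w" and "rds" produces an aw-element holding two vw-elements, leaving the choice to a downstream semantics analyzer.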

The word 'a~9"^^J^;jlrts to be compared 
tional processes for sele^ngw ^^j^3 3,pha. 
provided in a *»"eshonable^cnar ^^^.^^^ye
substitution could be '^^^lll^J^ZvisB^^^^rO. 
questionable ^^^^Z^TZ^^i remain, and 

°:^sru;^tedbvtbe^^^ 

consider, for «'«'"P^''*^'J*^ted below found 
tlonable character from F«.1.«tistrate 

<qc>a</q> , ^ 

<s>are still obscure<te> ^jedtoase- 
Theslream of elements could^supp^'^^^^^^^ 

„,antics analy^r whi^wouW att«^t^^ ^^^^ 
which word "^^^^^^ J^^ it merges that 
can determine which "^ente. For example, 

word into the «"^r"'r^^Sd to the semant- 
assume the following data is provided

<s>The origins of numerous English </s>

<aw>
<vw>wards</>

<vw>words</>
<s> are still obscure.</>
and it decides f«m -^^'^^'^^ replace the 

-^c^r^rrpoai and the second 

rt^l'o^- of numerous English <s>wo,ds<s> 
r ^sITnumerous English words<s> are 
S°r<S^ of numerous English <s> wo«.s are 
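The replacement and merging performed by the semantics analyzer in this example can be sketched as follows. This is a hedged illustration only: the representation of the stream as a list of `(type, payload)` pairs, the function name `resolve_alternatives`, and the `choose` callback standing in for the actual semantic analysis are assumptions made for the sketch.

```python
def resolve_alternatives(elements, choose):
    """Replace each resolvable aw-element by its chosen word and merge strings.

    elements: list of ("s", text) or ("aw", [verified words]) pairs.
    choose:   callable returning the correct word, or None if undecided.
    """
    out = []
    for kind, payload in elements:
        if kind == "aw":
            word = choose(payload)
            if word is None:
                # Undecided: the ambiguity stays recorded for later processes.
                out.append((kind, payload))
                continue
            kind, payload = "s", word
        if kind == "s" and out and out[-1][0] == "s":
            # Merge with the preceding character-string-element.
            out[-1] = ("s", out[-1][1] + payload)
        else:
            out.append((kind, payload))
    return out
```

When `choose` cannot decide, the aw-element is passed through unchanged, which mirrors the text's point that unresolved ambiguities remain available to later recognizers.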
tXrStted that the inte^^^^^^^^ 
11 omitted since ^^^^^^^^.^ which is used 

,0 collect chjacter^^^^^^^ 
of the samefont. Attwiei elements, and an 

lowing it to be ^^^^^^^f ^^^J^^^^ (def ined below). 

optional re^e«"" *° t he most recently 

Kthefontreferenceisnotsup^^ 

supplied one isused Th^tex^e^^^^ ^^^^^^^^ 

rTn»-----"^^ 

. ■'^^*'°lSid = 123fbnt = 2>listofs.aw.qcandqw 



«'^C^^i«us^-afon^^;^-^^^^^ 
analyidbythechja^--^^^^^^^ 
corded in fontDefelementewrtn^ t^ 

5 as possible. The <^of"^°^!rr recognizer is able to 
fontfamilyname.Htheo^a«^^^^ 

derive R with ~f ^^e contentsisleftem,^ 

" essorlntera^elybyj^^^^ 

The Id-attnbute enaoie .^^^g^^^ 

encefbntdescnptKjns^ThesEe. ^^^^^ 
in points. The base-at«nbute ^t»c ^.^^ „ 

base line is of feet by -"P^^^Sbl indicates the 
,5 thereisunderlmmg. theund^aw ^.^^ 

position of the ^^^^'^^^^^^i ,„ a fontDef-ele- 

rntr;:r.SfXn----'-^"- 

Note that the attnbutes are re 

brackets <>• ^ , _ „ -aament-element which Is 

Figure 8 »'"«'^^*«f,^!!JsMment-elementsare 

onetypeofgraphic^^e^;^^^;;, ^ ^ 

,5 u»«l'V*^Ss^rbi^3pirnage.TheW^^^ 
mentsitrecogntosftomu^ ^^^^^^ 

tribute enables higher e>»r?~ ^ the ending points 
n^ent-element The '^"^^^^'^e top Wt corner of the 
(x1.y1andx2.y2).relatwetothetop^ ^ 

certainty aboutthe e/aj J"^^^ J^, dy2-attributes. 
^ed in the dx1. yjj"^^^ 'possible offsets 
Thus. dx1. dy1. d^ and ''^;^~;ed to describe the 
oftheparameters(x1.y1.^^) The segment thick- 
35 llnesegmentgraphiys^u^^J^^^,^^ 

-irbrre^i--r'^"^ 

^ dx2 = 5y2 = 216«hick=17><^ ^^^^^3p^^ 
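To illustrate how a downstream graphics recognizer might use the recorded endpoint uncertainties, the sketch below models a segment-element with its x1, y1, x2, y2 coordinates and dx1, dy1, dx2, dy2 offsets, and tests whether two endpoints could be coincident. It is a sketch under assumptions: treating each offset as a symmetric tolerance around the coordinate, and the names `Segment` and `may_coincide`, are hypothetical choices not taken from the specification.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """Line segment with possible offsets (uncertainty) on each endpoint."""
    x1: int
    y1: int
    x2: int
    y2: int
    dx1: int = 0
    dy1: int = 0
    dx2: int = 0
    dy2: int = 0

    def end1(self):
        return (self.x1, self.y1, self.dx1, self.dy1)

    def end2(self):
        return (self.x2, self.y2, self.dx2, self.dy2)

def may_coincide(a, b):
    """True if two endpoints overlap once their offsets are taken into account."""
    (ax, ay, adx, ady), (bx, by, bdx, bdy) = a, b
    return abs(ax - bx) <= adx + bdx and abs(ay - by) <= ady + bdy
```

Two segments whose recorded endpoints differ by a few units can thus still be treated as meeting at a corner when the combined offsets cover the gap.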

AS with the ^on«'«*-«'^"^"i^^^ete Since the seg- 
vided within the first f ^^^^^t^baracter strings 
.ent-elementdoe^^tco^^^^^^^ 

(its content is EMPTY), "le ^.^^ ele- 

« owed by an ^te-^"*"""? "l^^dbyan^ 

ment-end |;\^^:3t„^Xent. which is an- 
Figure 9 .llustrates a" /^,^elements are 

other type °'.97'':^ra't e^ipsesandelllpt- 
used to note "T^^he W^nap image by the 
ical arcs recogn^d frj"^ J^^^^^e' enable higher 
short -f^^:^'::^,T^Cn the vertica.
thetal. ''^^'^^•^.Xuqh the center and one 
axis and the line Pa^"9 throu^S^ present 6 

of the end points of tlie ^^^'^^^ in milliradl- 
for arcs only. The angle can be measu 

theta2.dTheta2:sameastheta1.dTheta1for 

the other endpolnt: between the vertical 10 

thetaO. dThetaO: f "f J^^^^^hte attribute te 

» 1 5> <> ^ ;«.flnf^element which is a 

Figure 10 illustrates an. rnag^eem^^ 

third type of graphic^e^em^nt-T^.^^^^^ ^^^^ 

used to denote a j^^ng^^ ^'ttructured graphics, 
has not been resolved astert « 

and is therefore lef .n b^njJ^^J^e the file. 
The image element contems the n^^^ 

Theimageelementatobutesenw ^^ ^^^^^ 

and uncertainty f »°^^JSons (w. dw, h. dh) 
page (x. dx. ^J^^^^ explessed in bi«. 

of the image. The ^^^°'^-unite of measurement is 
perunitofmeasuremenUtteu^rts^ 30 

LppUedbythedsSjeam^^^^^ 
At the onset of the aocun 

tion. the D-^^^rt^^^lrofre papers 
ments. one perd.g.t.zed page of t P P^ 

Gradually, as <=haracterjtmg^;^^^^ techniques), 35 

arcs are -^^:^ZZ ^-^^^ ""'^"'^ 
the bitmaps ^^^^P'^^^e completion of the oper- 

480>SquiggleO ,«,t-element. which Is a 

ngure 11 ■'»"^';:!^ jllS Spotelements con- 
fourth type of g^P^^-t^otet^^ery small rectangular 
tainsmallimagesand<ien<,te^^ 

area left in bitmap ^^''"^'•^■^bols etc. The bit- so 
Tmudges. ci-ngbats. "okn^^^^^ ,„,oded 
map is small eno^S'';*^ J J™^ ^ the contents of 
conveniently in hexade^jn^l form a^ ^ ^ 
the spot-element rather than earn 

J J »itrihutes supply the position 55 



. Gives the number of 1-bit 

of eight. The ^y-^'^^^ZZf^^'^'^^'"^^^^''' 

propriate. . ^he Fig. 1 saniple 

a small »"a9« »»«*^- eso W - 25 W = 25 

<spot Id = n * 

>03FFB000...O „fprences to other ele- 

Figure 12 ^"S'^^^^^^^f ^^^"Tmage and spot- 
ments. The tej -^-J higher-level 
elements may J«J°"J^^es and pages. discussed 
elements (text blo*s. ^ jifler. Areference 

below), via a J^J^^^tn itemrelement. the 

to a single ete-^fVtTS^th'e value of the identifier 
single attribute of which has tne v 
of the referenced elerne|«. ession of ele- 

Areference to a "'""^'^ntWand'to-at- 

ments is made by a ^j^";^^^ first and last ref- 
tributes referto the WenW^rsot ^^^^^^^^ 

erencedelemente."Fj«rand l^^^^ ^^^^^ 

Chronological °"'f'"^:,e^nt is a short-hand no- 
,ntheDR8tream.Arange-e^ 

tation for an unbroken suoc^s'°n 

Ambiguities about fl^P"^ are used by 

tern^lements. 'jratmbSrSonable element 
processestoencodea number ^ 

Up-ings-Forlns^^o^^P^-^^ 
ognizedashavingfourt^ ^^^^p^^or 
and two on the nght, he logi ^^^^^ 
(or logical reconstructeO. unaWe ^ ^^^^^ 

lext reads as two '"'""'"^JJ^**^^^ 
themlntheortertopleft^bo^^^^^^ 

right; ortheordertopleft^ 9 ^ ^^^jy^er 

SX==i:S;Sn.tB,ocK.e.e- 

Figure 13 an invfelble 

ments encode '^^Z^Z^^o^bI^^^^ spaced 

boundary around « fxtUna °Jf^'„,3^e to 

textlines. The locabon ^J^^^^^,^^ are gW- 
,eftoornerofthepage.andU^e.^^^^^^ 

en by the x. y. dx and dy-aW]«"^ ^ dh. 
:„dLcertain«esa,«^^;^«/,J^^ between 
The interl-attnbute ^^^^^.e block; value is 
the equally-spaced line only. The 

zero when the te^Wo^ ^^^^^^ 

-C^^Tilustra^a^^^^ 
element encodes a rectengulaj^^ ^. ^ ^^^^^ 

equal to the page a^a^ ^-sj^ „t. as well as 

blocks images ;P°^^^^^^^ 
other frames. F ani^i^y e„t. a page- 

Figure ^5 S,%;e i^eces of Information 
element aggregates an xne v 
Figure 16 ^'^"^^^^^Joi L e\er.eu^s across
element enables ^.^'^^te ^sed by the logical struo- 
rrn"rS;rres«.a^cana.yzerto.-. 

rerflowone^^-^^^^^^^^^^^^ 

Figure 17 i»"S«^3%,fi„iaon. is the drSlieam 
top of the document „3^e of the 

element "tlfS^h^^S*'""* '^The 
measurement unit "^f ° " ''^at fraction of the 

■r,e t^ctto-f^ltt^e'c^'X^ dimensions and 
measurement unrt the MOT ^^^^ For example, rf 

attributes are: -peter fraction = 1000000> 

^'''^rcmil.^^s^ioftheelementsused 
Figures 18A-C.nusl« g^ 

. in the disclosed P^S^'J^^Xument.ecogrt sys- 
Figure 19 illustrates ^ Fjgs. 20 and 

temusUlewiththep^^^^^^^^^^ 
2lareflowcharts.nusfraW~^^^^^^^^ 

theF«.19system^«',^V(SlSo).ap 
,nordertolnputebrtn^apm«g^ 3,100 to prt^ 
ment is scanned "^ "9 ^1!^ 110. ttis understood 
duce a bitmap documenUm^S ^^^.^ed essen- 

that the scanning P^^,.'f;,eco9nilion processes 
tiallyatthesamet.,netha«hyeo^^^ .^^^^ 

are performed, or the b*^P ^ storage 

can^ supi-ied on jo^^j/f^ disk. The bitmap 
medium such as a ha d or fto^Y^^ ^ «,„ventlonal 
document '"^age « 33 ents the bitmaP 
segmenter 150 (SIIO)^"^; ^^,, as. for example 
Jage into smaller sub^f^ text, and graphi^ 

textual subimages fO"^'"'"J ^ics. The segmenter 
subimagescontainmg orty graphs ^ .^^^3 ,^ 

150 can 'teratively segment t^^^^^ 
smaller sublmages^nulea^^^^^ 

as containing only ^^'''"So a structure. mage re^^ 
subimagesarethensupp^-edtoa^^^ .^^^ 

ognizer(orgraphicsrec^n^e02 ^^^^ ^^^^g„,^3, 
subimages are ^"fP';^,^„tn advance that the bit- 
300. Of course, rf ^'^^^^^Z only text or graphics. 
,ep documenl^ge «mte.ns^ J,^^^^ .^^^ ,3^ 

it can be supplied ^^^^ i^er 300. 

ognteer 200 or cha^^'^^"^'^ .^3, 20O then trans- 
The structure image recogn^^^^^ 

,ormsthebi.map9^g-^^^^^ 
codedgraphicsdata(S^60) ^ ^i^s-elements 

g^phics-elements such as ^^^^^^^ ,a„. 
described above, using a 00 .^^^^^ 3^3 

nuage. That is. ""'^S"/?^." a Jype elements 

Jsi70): in.ag^^'^'^^n^J Is hexadecimal values if 
elements and ^^^^""^^^l^Zaae or subimages 
rre^SgSir^a^^redln^ 



data, they are in^r- 
ments and/or a^^^'^^^^f Sample. possiWe 
mation regarding uncerte^^^^^^^ 
, -;S:^rrs^onal.ca„bere.rded 

ages Into editable «>t?ucW« ^ ^ ^„,^3,200 
vLon.thegraph^^t^u^^^'^;^^^^^ 
,0 actsasafirsttransformawnrn^ g^^^ 

" first transformation <^^^ bitmap image into 
image to t«"^TJ 3renB Sntainlng coded fta 

one«moregraph.«jder«*nts 3^i^,den« 
defining graphics sfructo^s. a 

,5 tionmeansusingthedo«^menta ^^^^^ 

for identifying the rj^^n means, each 

lansformed by the ^'-^.^"^^'^irmentty^ identtfier 
graphics-elementmciudmgar. 

fndrcatingatypeof^ed^a^r 9^^3,3^ntW 
20 „ized bitmap "^^^^^^ determines that the 

the first transfom2°"'J^^;%i^^^^ 
coded datacontamedmthe^raP^^,^^ 
been transformed >wth a preow ^^3 
rnce.theiden«ic^;;^ea^^^^^^ 

tlThT^ded d^i contained in each graphics ele- 

Hechara^rrecogn^^^-t:^::^^^^ 
30 maptextualimage(orsubim^es).n^^^^^^^^^ 
rdata(S120)which«then^tored^^_^^^g^^ 

element in S140 ^^'f j^^^ ^bove. In order to 
tionable-characte.) ^ S« Sled character date «i 

determine ^^^^♦'^l ^^3^for^ questionabie-charao- 
,5 acharacter-stnng-elementwaq ^ ^ 

ter-e,emen..adetermm^^^^^^ 
whether a recognized cnara confidence. Al- 

at least a P'^^^^^'^"^"^;' J^^^cter into a question- 
though the °" ° 33*itoconveyuncertainty 
« aWe-character-elementserv^ additional informa- 

^ information a^^^^'^S^'ncertaln characters 
lion such as alternate poss'°' characters can 

or degrees of <^.'^^'\^j;^a.^ratier-e\^'^' 
aisobe included inaque^«"^„.^, 3,0 ^„ p^o- 

^ (S155). Thus, the chara^«^,^^„,3andques- 

r^h^L--^^^^^ 

P'-e^o^drco^S^^ 
^ lexiconofwordsthere.n.The^J^^^^^ 

eratingaccordingtothep jsert ^.^^^^ ^1 for 

perform the P^^^y'^i^^'Sement First, in S200. 

'each --^^^^ 

a ^urality of ^''^^^^^^^^^oSilement in the word 
55 for the questionable-character^^ , 

containing the Questionable ch^ 3, of 

I210. a determination -s madeas^,^ .^^^ ^^^^^ 
the words formed by the sud 
such words are referred to ^ ve ^^.

verified ^"^240 0^^^^^^ ^245. 

element is returned m S240 J que»- 

the uncertainty "^td based upon 

tionable-chamcter-element .s updat^d^ t. 

any determinations ^^^^q ^positive, each 
400. If the determmaton m S210 P^ ^^^^^^ 

verified word is placed .n 
(S220).Next.inS230.ifnjo«thano^^^^^ 

elements ^^t.^Z^'Je<i->r.or<i.e^e. 
character-element, tne Q^-element 

Each ««»«"^^;r.8tring-element by a semant- 
formed into ^ chara<*« sj^ns .^^ ^^..^^^ 

ics analyzer 500 ^'°f„^'J2t^„Sive-wonJ-element 
of the verified words in f««^"^; „ the se- 

is correct based upon ^"^"""J j^^tehof theveri- 
n,anticsanaly.ercan«j^^^.^^^^ ^. 

fied words in an ^^^Z^^etr^uX, ar.<i op- 
rect. it returns the ^''^'^^rSmation for each of 

rsrv^r;^ le^fie^ 

^'^^tus. When -^"-'!^,rp:,S3o: 
into ediUble coded -"J^*^ transfor- 
the character ««>9"^^^° ^ f transfbrmatlon 

operation on the !L_ more elements 

the textual bitmap ^"^^^J^^^°^Z a f irst Wen- 
eontaming -^;^:^:^^',^;^^6eso.p^^ la-v 
tif ication '"aa"^"^'"? ' „ more elements trans- 
guage for '^^n^yXXnSt^ ".eans. each ele- 

formed j'^^/'f^^^f "type '^"^^"^ " 

ment indudmg an element lyp ^3 ,gcog- 

type of coded '^Z^ZZ^^r.lee^e^^ 

nlzed bitmap textual '"^^9® „„,„coanizedwltha 
Elements containing characte^notrec^n^^^^^^^ 

character-string-elemente. ^^^3. 

The word •«'=°9"'^*'^°j;^each questionable- 
formation means for tran« ^ 

character-element ^nd adjaoent ^^^gble- 

nized characters m a same ^^^^^^^^ ^ords by 
character-element irtooneormo«ven^^^^^^ 

substituting ^''^'"^^'^^.^^XTthata wor^ 
able-character-element and ver«7 "9 

:ultingfromthesubstituttone«ste^^^^^ 

a second identific^o" "/^^^erif ied word in 

description languagefor^^^^^^^^ 

a verifled-word-element When m questlonable- 
«ed-word-e.«ne"^are.^aed^ 

character-element, the seco ^.ed-word-ele- 
also places the more than one vem^ 
ments in an alternative-word-elemenL me 



5 plied to semantcs artyzer5U^ .^^.^ ^^^^^^ 
for determining wl^diverrf^dw^ 

ave-word-element .s « P^^^^^^^ath/e-word-element; 
words 8""°""*"9.';Lm?ansforldentifyingthe 
and as a third ^^^"^J^^^^ ^Z^^q the alterna- 
,0 correct verified wo,^and fo^^^^;2rin9-element 
tive-word-element with a cnararae 
dning the correc* verif ied word. 



Claims

map image. 






3. The method of Claim 2 wl^relj^sa^^^^ 
saldsublmagecontamsagrap^^csbm^ y 
saidfirstreco9ni.J^2^a^«P^^^^^^^ 
type "fi^^^^^^ image trans- 

structures. 

formations and degrees of certainty being deter-
mined by said first recognizer.

.K A of daim 1 wherein said uncertainty 
The method of dairn ^ | transformations 

'^''''''''T^TTJ^^^-' recognizer, and 
first recognaer is a gong of said text 

said type identiner^en«.^ P^^^^^ 

bitrnap image or cues- 

ognizer as character sm g ^^^^^^^ 

tionable-character-elemen^^ consecutive 

string-element <^°« " ^^laracter recog- 

'^^'^^SSTefsrSid Prettermin^^ .eve. of 
nizer wrth at teast « Jy^^ye-character-e^e- 
confidence. eacn m ^^j-tw informatkin de- 

acter recognizBr- 




nized with at least ^ ; ... ( uncertain char- 
^'■'"S^cTatrrarp^^^^ uncertain char- 

,0. The .ethod ^ f ^ Sntyl^ 
fionabie-character-e^ement sa^ 

« formation pert^n.^^^^^ ,3,,, of 

nized with at 'east saw ^ confidence 

^5 ■ characters. 

---r:r-s^ror: 

„,ore graphics P ^i^-^^of sa'<' °^ 
- -""^'t^StS^ tSllg analyzed as 
more text b<»^P^"Trfsaidoneorinoregraph- 
texlportions;andeach^je^ ^ ^^^.^ 
ics subimages being ana^ 
...cture an^y^J to ^n^^J^,, graphics- 
25 graphics subunages inxo oi 
elements. 



using a word recognizer to transform each questionable-character-element and adjacent confidently recognized characters in the same word as said questionable-character-element into one or more verified-word-elements by substituting alternate characters for said questionable-character-element when words resulting from said substituting are recognized by said word recognizer; when more than one verified-word-element is transformed for each questionable-character-element, said more than one verified-word-elements are placed in an alternative-word-element when more than one verified words are recognized by said word recognizer.






8. The method of claim 7, further comprising: for each alternative-word-element, using a semantics analyzer to determine which one of the verified-word-elements in each alternative-word-element is correct, each alternative-word-element being replaced by a character-string-element corresponding to the one of said verified words contained in said alternative-word-element when said semantics analyzer determines that said one of said verified words is correct, said alternative-word-element remaining when more than one of said verified words is determined to be a correct word by said semantics analyzer.

9. The method of Claim 6. Wherein for each aues. 



1- ---^^^r^te ^"^^"^ 
ttansforming '^'^"•^"'J.nided date stream us- 

30 .magedateintoanediteb^^dedo ^^^^ 
ingadocumentdescnptionlangu^ 

tarnation regarding ""^^^^J apparatus 
ment transformation process, saio 

comprising: h-,„infl- 

^^■'^^'^^"'"Lmea^forperform.nga 

determined level « y element 
cation means also •"='"'''"9 ffe^ by said 
uncertainty information determ^ed 
first transformation means reflarmng 
'rld^taconteined in said element 




tainty information Indudes a cu 
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with which said f irst transformation means deter- 
mined said coded data. 

14. The apparatus of dain, 1^ ^^"^^^Z 
mains for a portion of said bitmap «nage. 



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§ Etymologies 

Etymologies oppeor in ^''"r^.^'SWe UC's*' 

. Obsam Orion. Accortfinq to the noted 

linguist :JSl^ ^ ^ 
words are still obscure ... 



AHD 



American Heritage Dictionary



FIG.1 
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F1G.5 
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FIG.12 
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<!DOCTYPE drStreonl 

<!ELEMENT drStream - O (page | frame | group | tBlock | text | segment | arc | fontDef | image | spot)*>

<!ATTLIST drStream
    unit      (meter|point|inch)    -- measurement unit --
    fraction  NUMBER                -- fraction of measurement unit -->



<!ELEMENT page
<!ATTLIST page 

y NUMBER 
h NUMBER 



<! ELEMENT 
<!ATTLIST
id 

x

dx

y

dy

dw

h

dh

<! ELEMENT 
<!ATTLIST 
id 



frame 
frame 



ID 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 



- O (altern | item | range)*

#REQUIRED
0 
0 

- O (altern | item | range)*

#REQUIRED
0
0
0



group 
group 



ID 



<!ELEMENT tBlock

<!ATTLIST tBlock

id ID
NUMBER

NUMBER

NUMBER
NUMBER

V 



- O (altern | item | range)*

#REQUIRED
- O (altern | item | range)*



#REQUIRED
0 
0 
0 



J 

FIG.18A



-- width --



-- abscissa --
-- error on x --

-- ordinate --
-- error on y --

-- error on w --
-- height --
-- error on h --
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dw 

h 
dh 
x1
dx1

y1

interl 
dir 

align 



NUMBER
NUMBER
NUMBER
NUMBER
NUMBER
NUMBER
NUMBER
NUMBER
NUMBER



0

0 

0 

0 

0 
0 
0 
0 
0 



-- abscissa of 1st char in block --



-- interline --
-- text flow direction --



(horiz | vertic)
(left | center

FIG.18A cont.

-- one element identifier --



<! ELEMENT item 
<!ATTLIST item
r 



IDREF 



<!ELEMENT range
<!ATTLIST range

from IDREF

<!ELEMENT altern



-- two element identifiers --



- O EMPTY
#REQUIRED

- O EMPTY

#REQUIRED
#REQUIRED



<! ELEMENT spot 
<!ATTLIST spot 
id 

X 

y 

dy 

bx 

by 

V 



ID 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 



- O (#PCDATA)

#REQUIRED

0 
0 

0 
0 
0 
0 

J 



-- hexadecimal value --



FIG.18B 
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< I ELEMENT image 
<!ATTLIST image
id 

X 

dx 

y 

dy 

w 

dw 

h 

dh 

resol 

<! ELEMENT arc 
<!ATTLIST arc
id 

X 

dx 

y 

dy 
r 

dr 

rShort 

drShort 

thick 

dThick 

theta0

dTheta0

theta1

dTheta1

theta2

dTheta2



- O



ID 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 



ID 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 



(#PCDATA)

#REQUIRED
0 
0 
0 
0 
0 
0 
0 
0 

300 



-- image file name -->



> 
> 



- O EMPTY



#REQUIRED

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 



FIG.18B cont.
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<! ELEMENT segment 
<!ATTLIST segment

id 

x1

dx1

y1

dy1

x2 
dx2 

y2 

dy2 

thick 

dThick

<!ELEMENT fontDef
<!ATTLIST fontDef
id 

size 
weight 



posture 

base 

under 

<! ELEMENT text 
<!ATTLIST text 
id 

font 

<! ELEMENT s 
<! ELEMENT aw 
<! ELEMENT vw 
<! ELEMENT qc 

<! ELEMENT qw 
> 



ID 



NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 

NUMBER 



- O EMPTY

#REQUIRED
0 
0 
0 
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